Abstract

Directional clustering can be expected in cosmic ray observations due to purely
statistical fluctuations for sources distributed randomly in the sky. We develop an
analytic approach to estimate the probability of random cluster configurations, and use
these results to study the strong potential of the HiRes, Auger, Telescope Array and
EUSO/OWL/AirWatch facilities for deciding whether any observed clustering is most
likely due to non-random sources.



Introduction 

An unsolved astrophysical mystery is the origin and nature of the extreme energy cosmic ray
primaries (EECRs) responsible for the observed events at the highest energies, $\sim 10^{20}$ eV [1].
About twenty events above $E_{GZK}$ have been observed by five different experiments [2]. The
longitudinal profile for one of these events (the highest energy Fly's Eye event at $3\times 10^{20}$ eV) is
available; it favors a nucleon or nuclear primary over a photon primary [3]. The origin of these
events is a mystery, for there are no visible source candidates within 50 Mpc except possibly
M87, a radio-loud AGN at $\sim 20$ Mpc. Since the observed events display a large-scale isotropy,
many sources rather than one source seem to be required. The nature of the primary is a
mystery, because interactions with the 2.73 K cosmic microwave background (CMB) render
the Universe opaque to nucleons at these energies, and double pair production on the cosmic
radio background (CRB) renders the Universe opaque to photons at these energies. The
theoretical prediction of the end of transparency for nucleons at $E_{GZK} \sim 5\times 10^{19}$ eV is the
famous "GZK cutoff" [4].

Models for the origin and nature of the primaries may be put into two broad categories.
In the first category are postulated sources of protons and photons "locally" distributed
within 50 to 100 Mpc. For these models, the propagation problem is mitigated. However,
the source problem is aggravated. In traversing a distance $D$, a charged particle interacting
with magnetic domains having coherence length $\lambda$ will bend through an energy-dependent
angle^1

$$\delta\theta \sim 0.5^\circ \times \frac{Z\,B_{\rm nG}}{E_{20}}\,\sqrt{D_{\rm Mpc}\,\lambda_{\rm Mpc}} . \qquad (1)$$

Here $B_{\rm nG}$ is the magnetic field in units of nanogauss, $E_{20}$ and $Z$ are the particle energy
in units of $10^{20}$ eV and charge, and the lengths $D$ and $\lambda$ are given in units of Mpc. It is
thought likely that coherent extragalactic fields are nanogauss in magnitude [5], in which
case super-GZK primaries from $< 50$ Mpc will typically bend only a few degrees (but note
that protons at $10^{19}$ eV will bend through $\sim 30^\circ$). Thus, local models either postulate
many invisible sources isotropically distributed with respect to the Galaxy to provide the
roughly isotropic flux observed above $E_{GZK}$, or postulate a large extragalactic magnetic field
to isotropize over our Northern Hemisphere the highest-energy particles^2 from a small
number of sources [6]. A common prediction for local models is little or no directional pairing
on small scales, especially when events with energy $\sim E_{GZK}$ are included with the $10^{20}$ eV
events^3. In particular, models invoking randomly distributed, decaying super-massive
particles (SMPs) as sources [8], and models invoking a large magnetic field with considerable
incoherent component, predict a chance distribution of observed events on the sky.

In the second category of models, the cosmic ray sources are put at cosmic distances
($> 100$ Mpc, i.e. $z > 0.02$). These large-$z$ models mitigate the source issue. However,
the propagation problem is exacerbated due to the increased distance. Most proposals of
this type postulate some stable, charge-neutral primary having limited interaction with the
radiation background. Examples include the neutrino (which may regenerate a local
nucleon/photon flux via "Z-bursts" [9], or may develop a strong interaction at high energies
[10]), magnetic monopoles [11], and the lightest SUSY baryon [12] (if the gluino mass is
$\sim 1$ GeV). Other large-$z$ models employ broken Lorentz invariance or broken CPT-symmetry
with effects generally suppressed by powers of $1/M_P$ but still large enough at $E > E_{GZK}$ to
suppress the nucleon-CMB interaction [13]. The nucleon-magnetic field interaction may also
be suppressed, in which case these nucleon primaries, like the charge-neutral primaries, are
not significantly bent by the intervening extragalactic magnetic fields. A common
prediction, then, of large-$z$ models is direct pointing of the primary's arrival direction back to the
source^4. If a source is of sufficient intensity and duration, then the observation of multiple
events from the direction of the source may be expected, beyond what is expected from
chance coincidence.

So it is seen that a major discriminator between the local and the large-$z$ models is the
occurrence of directional pairing compared to that expected from chance coincidence alone.
The AGASA experiment has already presented data strongly suggesting that directional
pairing is occurring at higher than chance coincidence [16]. Comparisons of event directions
in a combined data sample of four experiments further support non-chance coincidences
[17]. Furthermore, the energy-time correlation within pairs seems to disfavor models with
charged primaries originating from a common source of relatively short duration. This is
because magnetic diffusion should lead to a later arrival time (more bending) for the lower
energy charged primary in the pair, contrary to what is observed in some pairs [18].

^1 On average, half of the interactions of a super-GZK nucleon with the CMB change the isospin. At
energies for which $c\tau$ of the neutron is small compared to the interaction mfp of $\sim 6$ Mpc, the neutron
decays back to a proton with negligible energy loss and the bending-angle formula is unchanged. However,
at the energy $6\times 10^{20}$ eV, $c\tau$ for the neutron is comparable to the interaction mfp, so at higher energies the
nucleon bending-angle is reduced by a factor $\lesssim 2$.

^2 Some models postulate helium or iron nuclei as the primaries, to increase the bending by the charge
factors 2 and 26, respectively.

^3 There is the possibility of pairing due to magnetic focussing, if the projection of large-scale extragalactic
magnetic fields on our sky contains caustics [7], and if the incoherent magnetic fields are sufficiently small.

^4 There is some evidence that the highest energy events may indeed point to distant compact radio-loud
quasars [14]; however, more recent data do not seem to support the earlier result [15].
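
As a numerical illustration of the bending-angle estimate, the following Python sketch evaluates Eq. (1), read here as $\delta\theta \simeq 0.5^\circ\, Z\, B_{\rm nG}\, E_{20}^{-1} \sqrt{D_{\rm Mpc}\lambda_{\rm Mpc}}$. The function name, the argument names and the default coherence length of 1 Mpc are assumptions made for illustration only; they are not taken from the text.

```python
import math


def bending_angle_deg(e20, b_ng=1.0, z=1, d_mpc=50.0, lam_mpc=1.0):
    """Rough bending angle in degrees, following the assumed form of Eq. (1).

    e20     : particle energy in units of 1e20 eV
    b_ng    : magnetic field strength in nanogauss
    z       : charge of the primary
    d_mpc   : distance travelled, in Mpc
    lam_mpc : coherence length in Mpc (1 Mpc is an illustrative assumption)
    """
    return 0.5 * z * b_ng / e20 * math.sqrt(d_mpc * lam_mpc)


if __name__ == "__main__":
    for e20 in (1.0, 0.1):
        angle = bending_angle_deg(e20)
        print(f"E = {e20:g} x 1e20 eV, D = 50 Mpc: {angle:.1f} deg")
```

With these assumed inputs a proton at $10^{20}$ eV from 50 Mpc bends by a few degrees, while at $10^{19}$ eV the angle is ten times larger, in line with the estimates quoted in the text.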



It is the purpose of this paper to estimate in a straightforward manner the significance
of multiple events in future data. We do so by providing an analytic calculation of the
chance occurrence of directional multiples. It is relative to these chance probabilities that
observed rates will determine the rise or fall of models. We present a formula for all possible
multiples based on statistics alone, as a function of the total number of observed events
and the number of angular bins. We show that our analytic formula reproduces the
published AGASA probabilities (obtained by Monte Carlo simulation) fairly well, even without
inputting detailed knowledge of the experiment. We present the chance occurrence of
directional multiples for the HiRes experiment, the next-generation Auger and Telescope
Array experiments, and finally the proposed EUSO/OWL/AirWatch experiment. We show
that for 20 events at HiRes, the observation of a triplet or two doublets at a resolution of 2° or
less is unlikely at the $3\sigma$ level, as is the observation of two triplets or a quadruplet for 100
events at Auger. For the EUSO/OWL/AirWatch facility, the anticipated very large number
of events can be binned into a sizeable number of subsets. As will be shown, this process can
allow an explicit quantitative test of the probabilities predicted for certain cluster groupings
on the basis of purely chance coincidence.

Formula for Chance Coincidence 

We now present the combinatoric formula for the probability of various event distributions
in angle. Our formula is exact in the limit where the experimental efficiency for observing
events is effectively uniform over the coverage of the celestial sky. We imagine that the sky
coverage consists of a solid angle $\Omega$ divided into $N$ equal angular bins, each with solid angle
$\omega \simeq \pi\theta^2$ steradian; the number of bins with cone half-angle $\theta$ is

$$N \simeq \frac{\Omega}{\pi\theta^2} = 1045\, \frac{(\Omega/1.0\ {\rm sr})}{(\theta/1.0^\circ)^2} , \qquad (2)$$

where $\Omega$ is the solid angle (sidereal or galactic) on the celestial sphere covered by the
experiment. We toss $n$ events at random into these bins. As mentioned in the introduction,
such a chance distribution of events is just what is expected in some models for the EECRs,
e.g. randomly situated decaying SMPs, or charged-particle or monopole primaries traversing
incoherent magnetic fields.
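
The bin count of Eq. (2) is simple to evaluate numerically. The short Python sketch below does so; the function name `n_bins` is a hypothetical helper introduced here, and the only physics input is $N \simeq \Omega/(\pi\theta^2)$ with $\theta$ converted from degrees to radians.

```python
import math


def n_bins(omega_sr, theta_deg):
    """Number of angular bins N = Omega / (pi * theta^2), as in Eq. (2)."""
    theta_rad = math.radians(theta_deg)
    return omega_sr / (math.pi * theta_rad ** 2)


if __name__ == "__main__":
    print(round(n_bins(1.0, 1.0)))  # coefficient of Eq. (2): 1045
    print(round(n_bins(4.8, 3.0)))  # 4.8 sr binned at 3 degrees: 557
```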



Define each event distribution by specifying the partition of the $n$ total events into a
number $m_0$ of empty bins, a number $m_1$ of single hits, a number $m_2$ of double hits, etc., among
the $N$ angular bins which constitute the total sky exposure. The probability to obtain a
given event topology is:

$$P(\{m_i\}, n, N) = \frac{1}{N^n}\, \frac{N!}{m_0!\, m_1!\, m_2!\, m_3! \cdots}\, \frac{n!}{(0!)^{m_0} (1!)^{m_1} (2!)^{m_2} (3!)^{m_3} \cdots} . \qquad (3)$$

The $N!$ and $n!$ factors in the numerator count the permutations of the bins and the events,
respectively. The $m_j!$ and $j!$ factors in the denominator remove the overcounting of those
bins containing $j$ events, and the events within those bins, respectively. The normalization
factor $N^n$ is the total number of ways to partition $n$ events among $N$ bins.
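
For numerical work the factorials in Eq. (3) are best handled in logarithmic form. The following Python sketch is one way to do this, using the standard-library log-gamma function; the function name and the dictionary representation of a topology are illustrative choices of ours, not part of the original formulation.

```python
from math import lgamma, log, exp


def topology_probability(m, n, N):
    """Chance probability of Eq. (3) for a single topology.

    m : dict mapping j >= 1 to m_j, the number of bins holding exactly j events
    n : total number of events
    N : total number of bins (the number of empty bins m_0 is inferred)
    """
    if sum(j * mj for j, mj in m.items()) != n:
        raise ValueError("topology violates sum_j j*m_j = n")
    m0 = N - sum(m.values())
    if m0 < 0:
        raise ValueError("topology needs more bins than are available")
    # log of N! n! / N^n, minus log m_0!
    log_p = lgamma(N + 1) + lgamma(n + 1) - n * log(N) - lgamma(m0 + 1)
    # subtract log of m_j! and (j!)^m_j for every occupied multiplicity j
    for j, mj in m.items():
        log_p -= lgamma(mj + 1) + mj * lgamma(j + 1)
    return exp(log_p)
```

As a check by direct counting, two events thrown into three bins give two singlets with probability 2/3 and one doublet with probability 1/3, which is what `topology_probability({1: 2}, 2, 3)` and `topology_probability({2: 1}, 2, 3)` return.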

The variables in the probability are not all independent. The partitioning of events is
related to the total number of events by

$$\sum_{j=1} j \times m_j = n , \qquad (4)$$

and to the total number of bins by

$$\sum_{j=0} m_j = N . \qquad (5)$$

Because of the constraints in eqs. (4) and (5), we infer that the process is not described by
a simple multinomial or Poisson probability distribution.
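
The constraints (4) and (5) mean that each allowed topology corresponds to an integer partition of $n$ into at most $N$ parts. A minimal Python sketch, using exact rational arithmetic, enumerates these partitions for small $n$ and $N$ and checks that the probabilities of Eq. (3) sum to unity. The helper names are hypothetical, and the brute-force enumeration is only practical for small event numbers.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield the integer partitions of n as non-increasing lists of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest


def exact_probability(parts, n, N):
    """Eq. (3) as an exact fraction; parts lists the occupancies of non-empty bins."""
    m = Counter(parts)                  # m[j] = number of bins holding j events
    denom = factorial(N - len(parts))   # m_0!
    for j, mj in m.items():
        denom *= factorial(mj) * factorial(j) ** mj
    return Fraction(factorial(N) * factorial(n), N ** n * denom)


def all_topologies(n, N):
    """Map each allowed topology to its chance probability."""
    return {tuple(p): exact_probability(p, n, N)
            for p in partitions(n) if len(p) <= N}


if __name__ == "__main__":
    for topology, prob in all_topologies(4, 3).items():
        print(topology, prob)
```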

It is useful to use eqs. (4) and (5) to rewrite our exact probability (3) as

$$P(\{m_i\}, n, N) = \frac{N!}{N^N}\, \frac{n!}{n^n} \prod_{j=0} \frac{(\bar m_j)^{m_j}}{m_j!} , \qquad (6)$$

where we have defined

$$\bar m_j \equiv N \left(\frac{n}{N}\right)^{j} \frac{1}{j!} . \qquad (7)$$

In the $n \ll N$ limit, $\bar m_j$ is expected to approximate the mean number of $j$-plets, and eq.
(6) becomes roughly Poissonian. As an approximate mean, $\bar m_j$ defined in eq. (7) provides a
simple estimate of cluster probabilities due to chance for the $n \ll N$ case.
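
To see how well $\bar m_j$ approximates the true mean number of $j$-plets, one can compare it with the exact binomial expectation for the number of bins holding exactly $j$ events. The Python sketch below does this under the assumption that $\bar m_j = N (n/N)^j / j!$, which is the form whose Stirling approximation reproduces Eq. (12); the function names are ours.

```python
from math import comb, factorial


def mbar(j, n, N):
    """Approximate mean number of j-plets, assumed form N (n/N)^j / j!."""
    return N * (n / N) ** j / factorial(j)


def true_mean_jplets(j, n, N):
    """Exact mean number of bins holding exactly j of n uniformly thrown events."""
    p = 1.0 / N
    return N * comb(n, j) * p ** j * (1.0 - p) ** (n - j)


if __name__ == "__main__":
    n, N = 92, 557  # event and bin numbers used later for the AGASA comparison
    for j in (2, 3):
        print(j, mbar(j, n, N), true_mean_jplets(j, n, N))
```

The two quantities approach each other as $n/N$ decreases, which is the sense in which $\bar m_j$ serves as an approximate mean.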

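Scaling laws relating mean cluster numbers to total event numbers and binning angles
may be derived by inserting eq. (2) into eq. (7). Results are

$$\bar m_j = \frac{1}{j!}\, \frac{\Omega}{\pi\theta^2} \left(\frac{n\,\pi\theta^2}{\Omega}\right)^{j} \propto n^{j}\, \theta^{2(j-1)} \qquad (8)$$

and

$$\frac{\bar m_{j+1}}{\bar m_j} = \frac{n}{(j+1)\,N} = \frac{n\,\pi\theta^2}{(j+1)\,\Omega} . \qquad (9)$$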

These scaling laws may be used as a further test of the randomness of clustering in a data
sample. For example, the angular binning may be artificially expanded from the experimental
resolution to see if the cluster numbers follow the $\theta^{2(j-1)}$ scaling law. Of course,
signal-to-chance is optimised by choosing the binning angle close to the natural angle of the source
on the sky.
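
The $\theta^{2(j-1)}$ scaling can be checked numerically by evaluating the mean cluster number at two binning angles and taking the log-log slope. The Python sketch below assumes $\bar m_j = N (n/N)^j / j!$ with $N = \Omega/(\pi\theta^2)$; the function names and the sample inputs are illustrative.

```python
import math


def mbar_theta(j, n, omega_sr, theta_deg):
    """Mean number of j-plets for n events, solid angle omega_sr and bin half-angle theta_deg."""
    N = omega_sr / (math.pi * math.radians(theta_deg) ** 2)
    return N * (n / N) ** j / math.factorial(j)


def scaling_exponent(j, n, omega_sr, theta1_deg, theta2_deg):
    """Log-log slope of the mean j-plet number between two binning angles."""
    ratio = (mbar_theta(j, n, omega_sr, theta2_deg)
             / mbar_theta(j, n, omega_sr, theta1_deg))
    return math.log(ratio) / math.log(theta2_deg / theta1_deg)


if __name__ == "__main__":
    for j in (2, 3, 4):
        print(j, scaling_exponent(j, 100, 4.8, 1.0, 2.0))
```

The same slope could be measured on real data by re-binning the events at several angles and comparing with the random expectation $2(j-1)$.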

In the next section we examine the simplification of the exact formulas (6) and (7) that
results in the large $N, n$ limit. Even in the $N \gg n \gg 1$ limit, the resulting approximate
formula will be seen to be not quite Poissonian, since the variables in the set $\{m_j, n, N\}$ are
not independent.

Large N, n Limit

Two large-number limits of interest are $N \gg n \gg 1$, and $n > N \gg 1$. With bin numbers
typically $\sim 10^{3}$, the first limit applies to the AGASA, HiRes, Auger and Telescope Array
experiments; the second limit becomes relevant for the EUSO/OWL/AW experiment after
a year or more of running.

Approximation for $N \gg n \gg 1$

When $N \gg n$, the number $m_0$ of empty bins is of order $N$, and the number of bins $m_1$ with
single events (singlets) is of order $n$; the number of clusters (doublets, triplets, etc.) is small.
It is sensible to explicitly evaluate the not-so-interesting $j = 0$ and 1 terms in eqs. (6) and
(7). With the use of Stirling's approximation for the factorials, one arrives at a simple form
for the probability, valid when $N \gg n \gg 1$:

$$P(\{m_i\}, n, N) \approx \mathcal{P} \prod_{j=2} \frac{(\bar m_j)^{m_j}}{m_j!}\, e^{-\bar m_j\, r^{j} (j-2)!} , \qquad (10)$$

where $r \equiv (N - m_0)/n \approx 1$, and the prefactor $\mathcal{P}$ is

$$\mathcal{P} = e^{-(n - m_1)} \left(\frac{n}{m_1}\right)^{m_1 + \frac{1}{2}} . \qquad (11)$$

In the "sparse events" case here, where $N \gg n$, one expects the number of singlets $m_1$ to
approximate the number of events $n$. In this case the prefactor is near unity. The
non-Poisson nature of Eq. (10) is reflected in the factorials and powers of $r$ in the exponents,
and the deviation of the prefactor from unity. In our numerical work, we will provide curves
for the exact result, and for the Poisson approximation (obtained from our expressions (10)
and (11) by setting $\mathcal{P}$ to unity and omitting $r^{j} (j-2)!$ in the exponent, and by neglecting
the constraints of eqs. (4) and (5)). Note that in this Poisson approximation, $\bar m_j$ as defined
in Eq. (7) is truly the mean number of $j$-plets.
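
The Poisson approximation described above is easy to evaluate. As a hedged sketch, the Python code below treats the number of $j$-plets as a Poisson variable with mean $\bar m_j = N (n/N)^j / j!$ (our reading of Eq. (7)) and returns the chance of observing $k$ or more of them. The function names and the sample inputs are illustrative assumptions.

```python
import math


def mbar(j, n, N):
    """Assumed mean number of j-plets, N (n/N)^j / j!."""
    return N * (n / N) ** j / math.factorial(j)


def poisson_at_least(k, mu):
    """P(X >= k) for a Poisson variable X with mean mu."""
    below = sum(math.exp(-mu) * mu ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def poisson_inclusive(j, k, n, omega_sr, theta_deg):
    """Poisson estimate of the chance of k or more j-plets among n events."""
    N = omega_sr / (math.pi * math.radians(theta_deg) ** 2)
    return poisson_at_least(k, mbar(j, n, N))


if __name__ == "__main__":
    # 20 events, 7.3 sr, 2 degree binning: chance of two or more doublets
    print(poisson_inclusive(2, 2, 20, 7.3, 2.0))
```

Because this estimate ignores the constraints (4) and (5), it should be read as a quick upper-level guide rather than a substitute for the exact sum over topologies.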

Approximation for $n > N \gg 1$

In the case where $n > N \gg 1$, higher $j$-plets are common and the distribution of clusters
can be rather broad in $j$, since according to Eq. (9) the mean number of $(j+1)$-plets relative
to $j$-plets grows with $n/N$.

Already at $j = 1$ (2), Stirling's approximation to $j!$ is good to 8% (4%), and so we may
write $\bar m_j$ in the approximate form for $j \ge 1$:

$$\bar m_j \approx \sqrt{\frac{N^3}{2\pi e n}} \left(\frac{e\,n}{j\,N}\right)^{j+\frac{1}{2}} = \frac{N}{\sqrt{2\pi j}} \left(\frac{e\,n}{j\,N}\right)^{j} . \qquad (12)$$

Extremizing this expression with respect to $j$, one learns that the most populated $j$-plet
occurs near $j \sim n/N$. Combining this result with the broad distribution expected for large
$n/N$, one expects clusters with $j$ up to several $\times\, n/N$ to be common in the EUSO/OWL/AW
experiment.
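
Both statements, the accuracy of the Stirling form and the location of the most populated $j$-plet, can be checked numerically. The Python sketch below compares the Stirling expression $N (2\pi j)^{-1/2} (e n / j N)^j$ with $\bar m_j = N (n/N)^j / j!$ and scans $j$ for the maximum. Logarithms are used to avoid overflow at large $j$; the function names and the example values $n = 10^4$, $N = 10^3$ are illustrative.

```python
import math


def log_mbar(j, n, N):
    """Logarithm of N (n/N)^j / j!, evaluated with log-gamma."""
    return math.log(N) + j * math.log(n / N) - math.lgamma(j + 1)


def mbar_exact(j, n, N):
    return math.exp(log_mbar(j, n, N))


def mbar_stirling(j, n, N):
    """Stirling form N / sqrt(2 pi j) * (e n / (j N))^j, for j >= 1."""
    return N / math.sqrt(2 * math.pi * j) * (math.e * n / (j * N)) ** j


def most_populated_j(n, N, j_max=100):
    """Multiplicity j in 1..j_max with the largest mean number of j-plets."""
    return max(range(1, j_max + 1), key=lambda j: log_mbar(j, n, N))


if __name__ == "__main__":
    n, N = 10 ** 4, 10 ** 3
    for j in (1, 2, 10):
        print(j, mbar_stirling(j, n, N) / mbar_exact(j, n, N))
    print("most populated j:", most_populated_j(n, N))
```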

Comparison with Experiments 



There are two ongoing EECR experiments, AGASA [19] and HiRes [20]. There are two larger
area experiments under development, Auger [21] and Telescope Array (TA) [22]. Finally,
there are experiments proposed with still larger areas, EUSO [23], OWL [24] and AirWatch
[25]. A non-hostile merger of these latter experiments appears likely, so we refer to them
collectively as EUSO/OWL/AW. In Table 1 we list the parameters that define each of these
experiments for our purposes.



Table 1: Typical values of effective area $A$, celestial solid angle $\Omega$ [?], and angular resolution
$\theta_{min}$ for the existing and proposed EECR experiments. The incident flux $F(> E_{GZK}) =
10^{-19}$ cm$^{-2}$ s$^{-1}$ sr$^{-1}$ has been used to estimate the number of events above $E_{GZK} = 5\times 10^{19}$ eV.

Experiment                    AGASA     HiRes     Auger/TA         EUSO/OWL/AW
$A$ (km$^2$ sr)               150       800       $6\times 10^3$   $3\times 10^5$
$n$/yr $= A \times F(> E_{GZK})$   5         30        200              $10^4$
$\Omega$ (sr)                 4.8       7.3       4.8              $4\pi$
$\theta_{min}$                3.0°      0.5°      1.0°             1.0°
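
The event rates in Table 1 follow from multiplying the aperture by the assumed integral flux and the length of a year. The Python sketch below carries out this arithmetic, taking the apertures as 150, 800, $6\times 10^3$ and $3\times 10^5$ km$^2$ sr and the flux as $10^{-19}$ cm$^{-2}$ s$^{-1}$ sr$^{-1}$ as read from the table; the constant and function names are ours, and the results are rough since the table quotes rounded "typical" values.

```python
SECONDS_PER_YEAR = 3.156e7
KM2_TO_CM2 = 1.0e10
FLUX_ABOVE_GZK = 1.0e-19  # cm^-2 s^-1 sr^-1, value assumed from the Table 1 caption


def events_per_year(aperture_km2_sr, flux=FLUX_ABOVE_GZK):
    """Expected events per year for a given aperture (km^2 sr) and integral flux."""
    return aperture_km2_sr * KM2_TO_CM2 * flux * SECONDS_PER_YEAR


# apertures as read from Table 1
APERTURES_KM2_SR = {
    "AGASA": 150.0,
    "HiRes": 800.0,
    "Auger/TA": 6.0e3,
    "EUSO/OWL/AW": 3.0e5,
}

if __name__ == "__main__":
    for name, aperture in APERTURES_KM2_SR.items():
        print(f"{name}: {events_per_year(aperture):.0f} events per year")
```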



We proceed to normalize the analytic approach described above against the Monte Carlo
method used by the AGASA Collaboration [16], and then to provide some concrete examples
for future observation. In searching for cluster probabilities using Monte Carlo, a fixed
number $n$ of events is tossed into phase space, and clusters are defined by choosing a correlation
angle $\theta'$ (in principle independent of the resolution angle discussed earlier). For example,
a doublet is registered when an event falls within an angle $\theta'$ of a preceding event. In such
a manner, the AGASA Collaboration [16] found that 92 events yielded 12 or more doublets
in 1.5% of their trials. Their simulations utilized a correlation angle $\theta' = 3^\circ$ and declination
angles (roughly) between 10° and 70°. This gives a solid angle $\Omega \simeq 4.8$ steradian. If for the
moment we identify the correlation angle $\theta'$ with the resolution angle $\theta$, then from Eq. (2),
with $\theta = 3^\circ$, we find $N = 557$. Eq. (3) then gives

$$P(m_2 \ge 12,\ m_3 \ge 0,\ m_4 \ge 0,\ 92,\ 557) = 1.4\% . \qquad (13)$$

Considering the crudeness of the approximations, this result agrees with the 1.5% simulation
result.^5
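
An inclusive probability of the kind quoted in Eq. (13) can be obtained by summing Eq. (3) over all topologies that satisfy the stated condition. The Python sketch below is one possible implementation, under the simplifying assumption (ours) that bins holding five or more events contribute negligibly, so that only singlets through quadruplets are enumerated. The function names are hypothetical, and no claim is made that this reproduces the authors' numerical procedure.

```python
from math import lgamma, log, exp


def log_probability(m1, m2, m3, m4, n, N):
    """Logarithm of Eq. (3) for a topology with at most four events per bin."""
    counts = (m1, m2, m3, m4)
    m0 = N - sum(counts)  # number of empty bins
    log_p = lgamma(N + 1) + lgamma(n + 1) - n * log(N) - lgamma(m0 + 1)
    for j, mj in enumerate(counts, start=1):
        log_p -= lgamma(mj + 1) + mj * lgamma(j + 1)
    return log_p


def prob_at_least_doublets(k, n, N):
    """Chance of k or more doublets, with any number of triplets and quadruplets."""
    total = 0.0
    for m4 in range(n // 4 + 1):
        for m3 in range((n - 4 * m4) // 3 + 1):
            rest = n - 4 * m4 - 3 * m3      # events left for doublets and singlets
            for m2 in range(k, rest // 2 + 1):
                m1 = rest - 2 * m2
                if m1 + m2 + m3 + m4 <= N:  # cannot occupy more than N bins
                    total += exp(log_probability(m1, m2, m3, m4, n, N))
    return total


if __name__ == "__main__":
    # 92 events in 557 bins, 12 or more doublets (the case of Eq. (13))
    print(f"{prob_at_least_doublets(12, 92, 557):.4f}")
```

For four events in three bins the function can be verified by hand: the topologies with at least one doublet carry 54 of the 81 equally likely assignments, so `prob_at_least_doublets(1, 4, 3)` returns 2/3.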

For Auger, we adopt the same solid angle ($\Omega = 4.8$ sr). For HiRes, we estimate on
geometric grounds a solid angle of 10.9 sr [?]. The same geometric estimate for AGASA
gives a solid angle of 6.9 sr, which the acceptance profile reduces to the above-mentioned
4.8 sr. For the sake of our modeling, we apply the same reduction factor (4.8/6.9) to the
HiRes geometry, to arrive at an effective solid angle of 7.3 sr (Table 1). The purely statistical
probabilities for various cluster topologies can now be calculated as a function of the angular
resolution and the accumulated number $n$ of events.

^5 Note that if our estimate of the solid angle were to change somewhat, agreement with the Monte Carlo
results could still be achieved by introducing a slight deviation from the assumed equality $\theta' = \theta$.

Discussion of Results 



HiRes 

For the HiRes experiment, about 20 events at $10^{20}$ eV are expected when the first full year's
data is analyzed. We calculate the inclusive probabilities for one or more, two or more,
and three or more doublets, and for one or more triplets, over a range of angular binning using
Eqs. (2) and (3). Note that by "inclusive probability" we mean the stated number of $j$-plets
plus any other clusters; e.g. a topology with two doublets and one triplet counts as one
doublet, as two doublets, and as one triplet in the inclusive probabilities. The results are
displayed in Figure 1.

Figure 1: Inclusive probabilities for various clusters, given 20 events
at HiRes, as a function of the correlation angle $\theta$ (deg). The solid line is the exact result,
the dashed line is the Poisson approximation.
Several comments may be made with reference to this figure:

(a) For all except the 3 doublet configuration, the Poisson approximation using the mean
values in Eq. (7) provides an estimate good to within 50% of the non-approximate form;
for the (much suppressed) 3 doublet configuration, it overestimates the probability by
about a factor of 3 in much of the angular region.

(b) For angular binning tighter than 2°, an observation of two doublets among the first 20
events has a chance probability of less than 0.5%. Thus the observation of this topology
could be construed as evidence (at the $3\sigma$ level) for clustering beyond statistical. The
observation of a triplet within $< 3^\circ$ has a random probability of less than $10^{-2}$, and
hence observation of such a triplet would most likely signify clustered or repeating
sources, or magnetic focussing effects.

(c) With the accumulation of 40 events (not shown in the figure), the appearance of two
doublets has a probability of less than 0.5% for a correlation angle of 1° or less. This
illustrates how the good angular resolution of HiRes may be used to detect non-statistical
clustering with only a few observed clusters.

Auger 

Coming now to Auger, we present in Figure 2 the probabilities of observing 8 doublets, one 
triplet, and two triplets, in an event sample of 100 events. 
Figure 2: Inclusive probabilities for various clusters in a 100 event
sample at Auger ($\Omega = 4.8$ sr), as a function of the correlation angle $\theta$ (deg);
solid (exact), dashed (Poisson).

This graph illustrates several points of interest:

(a) The 8-doublet probability is extremely sensitive to the angular binning, and thus
uncertainties of the order of 0.5° in the region of $\theta < 2^\circ$ would preclude assigning a baseline
chance-probability to better than an order of magnitude for this topology. This
uncertainty may be avoided by breaking down the 100 events into smaller data sets. On the
other hand, it may be possible to use the sensitive dependence displayed here to
advantage: observation of a flatter dependence on angular bin-size could signal a non-chance
origin for the clustering.

(b) As in Fig. 1, it is seen that the Poisson approximation is good for some topologies, but
an overestimate for others; for 8 doublets, it overestimates the probability by about a
factor of 3 in much of the angular region.

(c) The observation of two triplets with angular binning of less than 2° would have a random
probability of less than 0.5%, and hence could be construed as 3-sigma evidence for
novel astrophysics, such as clustered or repeating sources or magnetic focussing.

(d) Not shown in the figure is the probability of observing a quadruplet, which turns out
to be about 0.5% for a binning angle of 2.5°. Hence the observation of a quadruplet
within 2.5° among the first 100 Auger events would be suggestive of clustering due to
an astrophysical cause.

EUSO/OWL/AW 

The large number of events expected in the EUSO/OWL/AW experiment presents an
extraordinary opportunity to test for non-random clustering, but also a numerical problem
for both our formula and for a direct Monte Carlo simulation. With $10^4$ super-GZK events
(Table 1) distributed among $10^3$ bins, 10-plets and beyond may be common occurrences (see
eq. (12)). One approach to the very large data sample is to envisage the $10^4$ super-GZK
events divided into 500 more manageable subsamples of 20 events each (there are many ways
to choose the event partitions, and some subtleties are involved). Then the probability
predictions for clustering in the random model can be straightforwardly assessed. For example,
in Figure 3 we show the inclusive probabilities for 1 and 2 doublets in 20 events, for the
EUSO/OWL/AW sky aperture.
Figure 3: Probabilities for various topologies in a 20 event subsample
at EUSO/OWL/AW.

We can see that for binning angles of 2.5°, we expect about 10% of the 500 20-event
subsamples, or about 50 subsamples, to have one doublet, and we expect about 0.3%, or
perhaps 1 or 2 subsamples, to have 2 doublets. Comparing subsamples in this way, deviations
from random clustering can be quantitatively assessed. Of course, it is possible that in the
large sample of EUSO/OWL/AW, some non-random high-$j$ clusters will emerge far above
background. In such a case, the random probabilities presented here become much less
relevant.

The same large data-set approach just described could also be used for Auger after a few 
years of running time, although with somewhat fewer statistics. 
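
To make the subsample comparison quantitative, one needs both the expected number of subsamples showing a given topology and the size of its statistical fluctuation. A minimal Python sketch is given below, assuming (our simplification) that the subsamples are statistically independent, so that the count is binomially distributed; the function name is hypothetical.

```python
import math


def expected_subsamples(n_subsamples, p):
    """Mean and binomial standard deviation of the number of subsamples
    showing a topology whose chance probability per subsample is p."""
    mean = n_subsamples * p
    sigma = math.sqrt(n_subsamples * p * (1.0 - p))
    return mean, sigma


if __name__ == "__main__":
    for label, p in (("one doublet", 0.10), ("two doublets", 0.003)):
        mean, sigma = expected_subsamples(500, p)
        print(f"{label}: {mean:.1f} +/- {sigma:.1f} subsamples")
```

An observed count lying several standard deviations above the mean would then point to clustering beyond the purely random expectation.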



Summary and Concluding Remarks 

We have presented an analytic study of clustering for cosmic rays based on a random
angular generation of events in the sky. Our probability formula is based on randomly assigning
events into fixed angular bins with uniform a priori probability. In reality, the efficiency of
experiments for observing events is not uniform across the sky coverage. For this reason,
the most careful quantitative statements about clustering probabilities of existing data must
come from Monte Carlo simulations incorporating experimental efficiencies. Nevertheless,
our results are in good agreement with the prior Monte Carlo study by the AGASA
Collaboration [16]. For this reason, we believe that our formula makes a significant contribution to the
field, and is especially useful in predicting event topologies for future experiments and larger
data samples.

We found that the use of Poisson distributions, with mean values given by Eq. (7), was
approximately valid for some topologies, but yielded overestimates for others. For some of
the interesting cases discussed in the preceding section, the Poisson estimates were factors
of 2 to 3 larger than the exact probabilities.

Results for the HiRes, Auger and EUSO/OWL/AW experiments were presented. These
results reveal which topologies occur with probabilities of 0.5% or less in the various
experiments; observation of these topologies would constitute evidence at the $3\sigma$ level for
astrophysical rather than random causes for the clustering. An observation of two or more
doublets in the first 20 events at HiRes, each doublet consisting of 2 events within less than
2° of each other, is one example shown in the text.

Topologies with highly suppressed chance probabilities are especially sensitive probes 
of non-random clustering. This situation is exemplified in our discussion of the 8-doublet 
topology for the Auger experiment with 100 events. Highly suppressed topologies may be 
rather difficult to use in practice since they exhibit extreme sensitivity to binning angle, which 
leads to great ambiguity in their statistical significance. On the other hand, this sensitivity 
may be useful as a diagnostic to distinguish between random clustering vs. clustering due to 
astrophysics. 

The large number of events anticipated in the proposed EUSO/OWL/AW experiment (and
to a lesser extent, at the Auger facility) presents an analytical challenge. Partitioning of the
total data sample into subsamples, and then comparing these subsamples, would provide a
direct test of the purely statistical predictions for clustering.

We have not included any source modeling in our analysis. Our chance probabilities
describe arrival directions randomly distributed on the celestial sky. In fact, this distribution
is reality for some models, such as decaying SMPs, and charged primaries with directions
randomized by incoherent cosmic magnetic fields. A complementary approach to our work is
to consider specific source models generating non-random angular distributions. Steps along
this line of inquiry have recently been taken [?]. Future progress in the field will involve
comparisons of the random and non-random model predictions with the data.